Ranking Relevant Verb Phrases Extracted from Historical Text
Abstract
In this paper, we present three approaches to automatic ranking of relevant verb phrases extracted from historical text. These approaches are based on conditional probability, log-likelihood ratio, and bag-of-words classification, respectively. The aim of the ranking in our study is to present verb phrases that have a high probability of describing work at the top of the results list, but the methods are likely to be applicable to other information needs as well. The results are evaluated using three different evaluation metrics: precision at k, R-precision, and average precision. In the best setting, 91 of the top 100 instances in the list are true positives.
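The three evaluation metrics named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names and the toy ranking are assumptions made for illustration, and the sketch assumes every relevant item appears somewhere in the ranked list. Under these definitions, 91 true positives among the top 100 corresponds to a precision at 100 of 0.91.

```python
from typing import Sequence


def precision_at_k(relevance: Sequence[bool], k: int) -> float:
    """Fraction of the top-k ranked items that are relevant."""
    if k <= 0:
        raise ValueError("k must be positive")
    # Ranks beyond the end of the list count as non-relevant.
    return sum(relevance[:k]) / k


def r_precision(relevance: Sequence[bool]) -> float:
    """Precision at rank R, where R is the number of relevant items.

    Assumes all relevant items occur in the ranked list.
    """
    r = sum(relevance)
    return precision_at_k(relevance, r) if r > 0 else 0.0


def average_precision(relevance: Sequence[bool]) -> float:
    """Mean of precision@k taken at each rank k that holds a relevant item."""
    hits = 0
    total = 0.0
    for rank, is_relevant in enumerate(relevance, start=1):
        if is_relevant:
            hits += 1
            total += hits / rank
    return total / hits if hits else 0.0


# Hypothetical ranked list: True marks a phrase judged relevant.
ranked = [True, False, True, True, False]
p_at_3 = precision_at_k(ranked, 3)
r_prec = r_precision(ranked)
avg_prec = average_precision(ranked)
```

Precision at k rewards a clean head of the list, R-precision fixes the cutoff at the number of relevant items, and average precision takes the whole ranking into account, which is why the three are often reported together.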
Similar resources
Improving Verb Phrase Extraction by Targeting Phrasal Verbs based on Valency Frames
In the Gender and Work project (GaW), historians are building a database with information on what men and women did for a living in the Early Modern Swedish society, i.e. approximately 1550–1800 [1]. This information is currently extracted by researchers manually going through large volumes of court records and church documents, searching for relevant text passages describing working activities...
Improving Verb Phrase Extraction from Historical Text by use of Verb Valency Frames
In this paper we explore the idea of using verb valency information to improve verb phrase extraction from historical text. As a case study, we perform experiments on Early Modern Swedish data, but the approach could easily be transferred to other languages and/or time periods as well. We show that by using verb valency information in a post-processing step to the verb phrase extraction system,...
Support Vector Machines Applied To The Classification Of Semantic Relations In Nominalized Noun Phrases
The discovery of semantic relations in text plays an important role in many NLP applications. This paper presents a method for the automatic classification of semantic relations in nominalized noun phrases. Nominalizations represent a subclass of NP constructions in which either the head or the modifier noun is derived from a verb while the other noun is an argument of this verb. Especially des...
Extracting phrases describing problems with products and services from twitter messages
Social media contains many types of information that are useful to businesses. In this paper we discuss the automatic extraction, from Twitter data, of descriptions of problems that consumers experience with products and services. We first identify the problem tweets, i.e. the tweets containing descriptions of problems. We then extract the phrases that describe the problem. In our approach such descriptions...
HistSearch - Implementation and Evaluation of a Web-based Tool for Automatic Information Extraction from Historical Text
Due to a lack of NLP tools adapted to the task of analysing historical text, historians and other researchers in humanities often need to manually search through large volumes of text in order to find certain pieces of information of interest to their research. In this paper, we present a web-based tool for automatic information extraction from historical text, with the aim of facilitating this...